[Day 02] A Beginner's Guide to Generative AI

2024 iThome Ironman Challenge (16th Ironman)

Generative AI

Part 2 of the series "T 大使 AI 之旅" (T Ambassador's AI Journey)

Author: 我的狗狗叫饅頭

2024-08-06 23:58:07


Recap

In the previous post I said this challenge will focus on text generation with generative AI, so let's take a look at large language models!

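Large Language Models (LLMs)

Depending on the situation, I get the model I want to use in one of the following ways:

Local environment: ollama

ollama is open-source software, and every model in its library can be downloaded and used, including the well-known LLMs listed below. Once a model has been downloaded, running it with ollama needs no network connection at all; inference happens entirely on your own hardware.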

Meta - llama 3.1
Google - gemma 2
Microsoft - phi 3
Mistral AI - mistral

Hands-on example: llama 3.1

[Screenshot, 2024-08-06 22.47.04: llama 3.1 running in ollama]
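
Besides the interactive terminal, a locally running ollama server can also be called from code. Below is a minimal sketch, assuming ollama is serving on its default address http://localhost:11434 and that the model has already been pulled under the tag llama3.1. The helper names build_generate_request and generate are made up for illustration.

```python
import json
import urllib.request

# Assumption: ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming request for ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False the whole answer arrives in one JSON object.
    return body["response"]
```

With the server running, generate("llama3.1", "Why is the sky blue?") should return the model's answer as a string. Nothing leaves your machine, since the request goes to localhost.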

Hugging Face

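Hugging Face is often called "the GitHub of AI". Its biggest draw is that it is a hub for open models, bringing together a huge number of models and datasets. It offers many other services too, so go explore if you are interested! Some LLMs require a Hugging Face token before you can use them, but getting one only takes registering an account, and it is free. Most open models can be found on Hugging Face: the llama, mistral, and gemma families mentioned above, earlier models such as T5 and bert, and models trained in Taiwan such as breeze (MediaTek), taide (National Science and Technology Council), and TAME (a collaboration between several companies and National Taiwan University's computer science and information management departments).

Hands-on example: the TAME (Taiwan Mixture of Experts) model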

import torch
from transformers import pipeline, StoppingCriteria


# Custom stopping criterion: stop once the generated ids end with eos_sequence
class EosListStoppingCriteria(StoppingCriteria):
    def __init__(self, eos_sequence=(128256,)):
        # A tuple default avoids sharing one mutable list between instances
        self.eos_sequence = list(eos_sequence)

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Compare the tail of every sequence in the batch with the EOS sequence
        last_ids = input_ids[:, -len(self.eos_sequence):].tolist()
        return self.eos_sequence in last_ids


# Initialize the model with automatic device mapping
llm = pipeline(
    "text-generation",
    model="yentinglin/Llama-3-Taiwan-8B-Instruct",
    device_map="auto",
    token="your hugging face token",  # placeholder: replace with your own token
)
tokenizer = llm.tokenizer

chat = [
    # System: "You are a native Taiwanese who understands Taiwan's society, culture, and slang."
    {"role": "system", "content": "你是一位道地的台灣人，了解台灣的社會文化和俚語。"},
    # User: "What does '8+9' mean in Taiwan?"
    {"role": "user", "content": "台灣8+9是什麼？"},
]

# Flatten the chat into a single prompt string using the model's chat template
flatten_chat_for_generation = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# Generate a response using the custom stopping criteria
output = llm(
    flatten_chat_for_generation,
    return_full_text=False,
    max_new_tokens=128,
    do_sample=True,  # sampling must be on for top_p and temperature to apply
    top_p=0.9,
    temperature=0.7,
    stopping_criteria=[EosListStoppingCriteria([tokenizer.eos_token_id])],
)
print(output[0]["generated_text"])

Output
[Screenshot of the model's response, CleanShot 2024-08-06 at 20.04.33]
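
To see what the stopping criterion is checking without loading a model, here is a minimal pure-Python sketch of the same comparison. The function name ends_with_eos and the toy token ids are made up for illustration; only the id 128256 is taken from the code above.

```python
from typing import List, Sequence


def ends_with_eos(batch_ids: Sequence[Sequence[int]], eos_sequence: Sequence[int]) -> bool:
    """Return True if any sequence in the batch ends with eos_sequence."""
    eos = list(eos_sequence)
    # Keep only the last len(eos) ids of each sequence, mirroring the
    # tensor slice input_ids[:, -len(eos_sequence):] in the class above.
    tails: List[List[int]] = [list(ids[-len(eos):]) for ids in batch_ids]
    return eos in tails


# Toy ids: the first batch has not produced the EOS id yet, the second has.
print(ends_with_eos([[11, 42, 7]], [128256]))       # False
print(ends_with_eos([[11, 42, 128256]], [128256]))  # True
```

Generation keeps going while the check returns False and stops as soon as it returns True, which is why the pipeline call passes the tokenizer's own eos_token_id instead of relying on the hard-coded default.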

API Key

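Calling an LLM with an API key is probably the approach most people have heard of: if you want to use the models behind ChatGPT without going through its web UI, you need the API.

Google's Gemini is also accessed with an API key!

Hands-on example: curl with an OpenAI API key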

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "system",
        "content": "You are a K-pop fan."
      },
      {
        "role": "user",
        "content": "Can you tell me the members of aespa?"
      }
    ]
  }'
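
The same request can be sent from Python. Below is a minimal sketch using only the standard library, assuming the key is stored in the OPENAI_API_KEY environment variable as in the curl command. The helper names build_chat_request and chat are made up for illustration.

```python
import json
import os
import urllib.request

OPENAI_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(api_key: str, system: str, user: str,
                       model: str = "gpt-4o-mini") -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }
    return urllib.request.Request(
        OPENAI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def chat(system: str, user: str) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(os.environ["OPENAI_API_KEY"], system, user)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.loads(response.read().decode("utf-8"))
    # The reply text sits in the first choice's message.
    return body["choices"][0]["message"]["content"]
```

Reading the key from the environment keeps it out of the source file, which matters as soon as the script is shared or committed.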

Output

台智雲 (TWSC) 🇹🇼

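I have covered a lot of LLMs so far, but none of them is the one I will use for this Ironman challenge. The models I will mainly use are from 台智雲. They have brought together many LLMs and trained them further to build language models localized for Taiwan (see their list of available models). Their models are also called with an API key, and a newly registered personal account gets two months or NT$10,000 of credit to use.

The models understand instructions written in Chinese remarkably well. Homegrown Taiwanese models are great ❤️

Hands-on example: 台智雲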

import json
import requests

# Placeholders: fill in the values from your own account
MODEL_NAME = 'MODEL NAME'
API_KEY = 'API KEY'
API_URL = 'API URL'

# Generation parameters
max_new_tokens = 500
temperature = 0.01
top_k = 10
top_p = 1
frequence_penalty = 1.03  # spelling kept exactly as in the original request body


def conversation(system, contents):
    headers = {
        "content-type": "application/json",
        "X-API-KEY": API_KEY,
        "X-API-HOST": "afs-inference",
    }

    # contents alternates between user ("human") and assistant turns
    roles = ["human", "assistant"]
    messages = []
    if system is not None:
        messages.append({"role": "system", "content": system})

    for index, content in enumerate(contents):
        messages.append({"role": roles[index % 2], "content": content})

    data = {
        "model": MODEL_NAME,
        "messages": messages,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
            "top_k": top_k,
            "top_p": top_p,
            "frequence_penalty": frequence_penalty,
        },
    }

    result = ""
    try:
        response = requests.post(API_URL + "/models/conversation", json=data, headers=headers)
        if response.status_code == 200:
            result = json.loads(response.text, strict=False)['generated_text']
        else:
            print("error: HTTP", response.status_code)
    except (requests.RequestException, ValueError, KeyError) as exc:
        # Network failure, invalid JSON, or a response without 'generated_text'
        print("error:", exc)
    return result.strip("\n")


# System prompt: "You are a K-pop fan."
system_prompt = "你是一位 Kpop 的粉絲。"
# Turns: user "Who is Aespa?", assistant "A Korean girl group founded by the
# Korean entertainment company SM Entertainment.", user "Who are the members?"
contents = ["Aespa是誰？", "是由韓國娛樂公司 SM 娛樂創立的韓國女團。", "她們的成員有誰？"]
result = conversation(system_prompt, contents)
print(result)

Output
[Screenshot of the model's response, CleanShot 2024-08-06 at 23.48.28]
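
The role alternation inside conversation() is easy to check on its own. Below is a minimal sketch that pulls just that step out into a pure function, so the resulting message list can be inspected without calling the API. The name build_messages and the English sample turns are made up for illustration.

```python
from typing import Dict, List, Optional


def build_messages(system: Optional[str], contents: List[str]) -> List[Dict[str, str]]:
    """Turn a flat list of turns into role-tagged messages.

    Even positions are user ("human") turns and odd positions are
    assistant turns, so contents should end on a user turn.
    """
    roles = ["human", "assistant"]
    messages = []
    if system is not None:
        messages.append({"role": "system", "content": system})
    for index, content in enumerate(contents):
        messages.append({"role": roles[index % 2], "content": content})
    return messages


turns = ["Who is aespa?", "A Korean girl group.", "Who are the members?"]
for message in build_messages("You are a K-pop fan.", turns):
    print(message["role"], "->", message["content"])
```

Passing the earlier turns back in on every call is what gives the model its memory of the conversation: the final question only makes sense because the first two turns are sent along with it.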